Noisy Networks for Exploration

نویسندگان

  • Meire Fortunato
  • Mohammad Gheshlaghi Azar
  • Bilal Piot
  • Jacob Menick
  • Ian Osband
  • Alex Graves
  • Vlad Mnih
  • Rémi Munos
  • Demis Hassabis
  • Olivier Pietquin
  • Charles Blundell
  • Shane Legg
چکیده

We introduce NoisyNet, a deep reinforcement learning agent with parametric noise added to its weights, and show that the induced stochasticity of the agent’s policy can be used to aid efficient exploration. The parameters of the noise are learned with gradient descent along with the remaining network weights. NoisyNet is straightforward to implement and adds little computational overhead. We find that replacing the conventional exploration heuristics for A3C, DQN and dueling agents (entropy reward and -greedy respectively) with NoisyNet yields substantially higher scores for a wide range of Atari games, in some cases advancing the agent from sub to super-human performance.

برای دانلود رایگان متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

On the effect of low-quality node observation on learning over incremental adaptive networks

In this paper, we study the impact of low-quality node on the performance of incremental least mean square (ILMS) adaptive networks. Adaptive networks involve many nodes with adaptation and learning capabilities. Low-quality mode in the performance of a node in a practical sensor network is modeled by the observation of pure noise (its observation noise) that leads to an unreliable measurement....

متن کامل

A method to solve the problem of missing data, outlier data and noisy data in order to improve the performance of human and information interaction

Abstract Purpose: Errors in data collection and failure to pay attention to data that are noisy in the collection process for any reason cause problems in data-based analysis and, as a result, wrong decision-making. Therefore, solving the problem of missing or noisy data before processing and analysis is of vital importance in analytical systems. The purpose of this paper is to provide a metho...

متن کامل

An Information-Theoretic Discussion of Convolutional Bottleneck Features for Robust Speech Recognition

Convolutional Neural Networks (CNNs) have been shown their performance in speech recognition systems for extracting features, and also acoustic modeling. In addition, CNNs have been used for robust speech recognition and competitive results have been reported. Convolutive Bottleneck Network (CBN) is a kind of CNNs which has a bottleneck layer among its fully connected layers. The bottleneck fea...

متن کامل

H∞ Sampled-Data Controller Design for Stochastic Genetic Regulatory Networks

Artificially regulating gene expression is an important step in developing new treatment for system-level disease such as cancer. In this paper, we propose a method to regulate gene expression based on sampled-data measurements of gene products concentrations. Inherent noisy behaviour of Gene regulatory networks are modeled with stochastic nonlinear differential equation. To synthesize feed...

متن کامل

CRFA-CRBM: a hybrid technique for anomaly recognition in regional geochemical exploration; case study: Dehsalm area, east of Iran

Identification of geochemical anomalies is a significant step during regional geochemical exploration. In this matter, new techniques have been developed based on deep learning networks. These simple-structure-networks act like our brains on processing the data by simulating deep layers of thinking. In this paper, a hybrid compositional-deep learning technique was applied to identify the anomal...

متن کامل

ذخیره در منابع من


  با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید

عنوان ژورنال:
  • CoRR

دوره abs/1706.10295  شماره 

صفحات  -

تاریخ انتشار 2017